-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config-linux: RFC 2119 tightening for namespaces #767
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
config-linux.md
Outdated
|
||
If `path` is not specified, the runtime MUST create a new [container namespace](glossary.md#container-namespace) of type `type`. | ||
For hierarchical namespaces (e.g. `pid`, `user`), the new container namespace MUST be a child of the [runtime namespace](glossary.md#runtime-namespace) of that type. | ||
For seeded namespaces (e.g. `mount`, `uts`), the new container namespace MUST be seeded by the runtime namespace of that type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me why these two restrictions are necessary (who the parent / what the seed should be). On the one hand they are useful for some circumstances but I'm having trouble convincing myself that these requirements are not going make weird restrictions for a non-cli interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the one hand they are useful for some circumstances…
For example, if you ask for a new uts
namespace but do not set the optional hostname
, having the seed defined means that the hostname in the container UTS namespace is well-defined (it will be whatever the hostname was in the runtime UTS namespace).
This is less of an issue for the mount
namespace, because with root.path
REQUIRED, there's no way to avoid clobbering whatever mounts you got from your seed (which makes not asking for a new mount
namespace exciting ;).
… but I'm having trouble convincing myself that these requirements are not going make weird restrictions for a non-cli interface…
We already have some of “runtime namespace” conditions (e.g. for linux.namespaces[].path
which have this “caller may not have a good idea what the runtime namespace actually is” situation. And this is clearly an issue already for namespaces which are not set in linux.namespaces[]
.
I expect runtimes, CLI and otherwise, will document what they consider the runtime namespaces to be to avoid any ambiguity. For a CLI, it would probably, but not necessarily, be the caller's namespaces. For a web UI, it might be:
The runtime mount namespace will only contain a root directory owned by 0:0 with 755 permissions. The runtime UTS namespace will have a
test
hostname. …
But I don't see how this has an impact unique to the “defining the seed namespace” use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are specifying implementation specific functionality vs just telling people what the field in the spec is used for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are specifying implementation specific functionality vs just telling people what the field in the spec is used for.
If we want to explicitly make the parent / seed implementation-defined and not tied to the implementation-defined runtime namespaces, I guess we could do that. But I think “I this container to run in a new user/pid namespace that is a child of the runtime user/pid namespace” should be something that has a portable config expression. Otherwise it becomes very unclear what to put in the hostID
field for (u|g)idMappings
, because you don't know what namespace will be used to interpret the hostID
s.
And while I expect the child/seeded namespaces will be created via a clone(2)
call or an unshare(2)
call, the wording here does not require that. If the kernel grows new ways to create new namespaces, runtimes will be free to use them, as long as they preserve the required relationship between namespaces.
* **`path`** *(string, OPTIONAL)* - an absolute path to namespace file in the [runtime mount namespace](glossary.md#runtime-namespace) | ||
* **`path`** *(string, OPTIONAL)* - an absolute path to namespace file in the [runtime mount namespace](glossary.md#runtime-namespace). | ||
The runtime MUST place the container process in the namespace associated with that `path`. | ||
The runtime MUST [generate an error](runtime.md#errors) if `path` is not associated with a namespace of type `type`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can drop the text below this line.
@@ -35,11 +35,17 @@ The following parameters can be specified to setup namespaces: | |||
* **`user`** the container will be able to remap user and group IDs from the host to local users and groups within the container. | |||
* **`cgroup`** the container will have an isolated view of the cgroup hierarchy. | |||
|
|||
* **`path`** *(string, OPTIONAL)* - an absolute path to namespace file in the [runtime mount namespace](glossary.md#runtime-namespace) | |||
* **`path`** *(string, OPTIONAL)* - an absolute path to namespace file in the [runtime mount namespace](glossary.md#runtime-namespace). | |||
The runtime MUST place the container process in the namespace associated with that `path`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
39, 40 and 42 all good to keep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
39, 40 and 42 all good to keep
Done (punting the other lines to #795).
These were contentious [1,2], so they weren't part of the previous commit. I still think we want to say something about these relationships. For example, if you ask for a new uts namespace but do not set the optional hostname, having the seed defined means that the hostname in the container UTS namespace is well-defined (it will be whatever the hostname was in the runtime UTS namespace). This is less of an issue for the mount namespace, because with root.path REQUIRED, there's no way to avoid clobbering whatever mounts you got from your seed (which makes not asking for a new mount namespace exciting ;). We already have some of "runtime namespace" conditions (e.g. when linux.namespaces[].path is unset), so runtimes should already have implementation-specific wording around what the runtime namespaces are (we don't explicitly make them implementation-defined, although we probably should). Anyhow, that's not a new concept added by this commit. If we want to explicitly make the parent / seed implementation-defined and not tied to the implementation-defined runtime namespaces, I guess we could do that (by rejecting this commit). But I think "I want this container to run in a new user/pid namespace that is a child of the runtime user/pid namespace" should be something that has a portable config expression. Otherwise it becomes very unclear what to put in the hostID field for (u|g)idMappings, because you don't know what namespace will be used to interpret the hostIDs. [1]: opencontainers#767 (comment) [2]: opencontainers#767 (comment) Signed-off-by: W. Trevor King <wking@tremily.us>
These were contentious [1,2], so they weren't part of the previous commit. I still think we want to say something about these relationships. We already have some of "runtime namespace" conditions (e.g. when a type is not listed in linux.namespaces[]), so runtimes should already have implementation-specific wording around what the runtime namespaces are (we don't explicitly make them implementation-defined, although we probably should). Anyhow, that's not a new concept added by this commit. # Seeded namespaces For example, if you ask for a new uts namespace but do not set the optional hostname, having the seed defined means that the hostname in the container UTS namespace is well-defined (it will be whatever the hostname was in the runtime UTS namespace). This is less of an issue for the mount namespace, because with root.path REQUIRED, there's no way to avoid clobbering whatever mounts you got from your seed (which makes not asking for a new mount namespace exciting ;). # Hierarchical namespaces I think "I want this container to run in a new user/pid namespace that is a child of the runtime user/pid namespace" should be something that has a portable config expression. Otherwise it becomes very unclear what to put in the hostID field for (u|g)idMappings, because you don't know what namespace will be used to interpret the hostIDs. # Namespace ownership This is another case where I think specified clarity is essential. A new network namespace will not be very useful if you don't know who owns it. [1]: opencontainers#767 (comment) [2]: opencontainers#767 (comment) Signed-off-by: W. Trevor King <wking@tremily.us>
Previously we had no MUST-level runtime requirements for namespace entries in valid configs. This commit attempts to pin those down. I think we want more wording about new namespace creation (what namespace is the seed/parent? Which user namespace owns a runtime namespace? For more background on hierarchical namespaces, see [1]. For more background on the owning user namespace idea, see [2,3,4]), but that wording proved contentious [5,6], so I punted it to [7]. The "'path' not associated with a namespace of type 'type'" condition ensures that runtimes don't blindly call setns(2) on the path without setting nstype nonzero. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788 nsfs: add ioctl to get a parent namespace, 2016-09-06 [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6786741dbf99e44fb0c0ed85a37582b8a26f1c3b nsfs: add ioctl to get owning user namespace for ns file descriptor, 2016-09-06 [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5ff5ce6e20ee22511398bb31fb912466cf82a36 nsfs: Add an ioctl() to return the namespace type, 2017-01-25 [4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d95fa3c76a66b6d76b1e109ea505c55e66360f3c nsfs: Add an ioctl() to return owner UID of a userns, 2017-01-25 [5]: opencontainers#767 (comment) [6]: opencontainers#767 (comment) [7]: opencontainers#795 Signed-off-by: W. Trevor King <wking@tremily.us>
For more background on hierarchical namespaces, see [1]. For more background on the owning user namespace idea, see [2,3,4]). These were contentious [5,6], so they weren't part of the previous commit. I still think we want to say something about these relationships. We already have some of "runtime namespace" conditions (e.g. when a type is not listed in linux.namespaces[]), so runtimes should already have implementation-specific wording around what the runtime namespaces are (we don't explicitly make them implementation-defined, although we probably should). Anyhow, that's not a new concept added by this commit. # Seeded namespaces For example, if you ask for a new uts namespace but do not set the optional hostname, having the seed defined means that the hostname in the container UTS namespace is well-defined (it will be whatever the hostname was in the runtime UTS namespace). This is less of an issue for the mount namespace, because with root.path REQUIRED, there's no way to avoid clobbering whatever mounts you got from your seed (which makes not asking for a new mount namespace exciting ;). # Hierarchical namespaces I think "I want this container to run in a new user/pid namespace that is a child of the runtime user/pid namespace" should be something that has a portable config expression. Otherwise it becomes very unclear what to put in the hostID field for (u|g)idMappings, because you don't know what namespace will be used to interpret the hostIDs. # Namespace ownership This is another case where I think specified clarity is essential. A new network namespace will not be very useful if you don't know who owns it. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788 nsfs: add ioctl to get a parent namespace, 2016-09-06 [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6786741dbf99e44fb0c0ed85a37582b8a26f1c3b nsfs: add ioctl to get owning user namespace for ns file descriptor, 2016-09-06 [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5ff5ce6e20ee22511398bb31fb912466cf82a36 nsfs: Add an ioctl() to return the namespace type, 2017-01-25 [4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d95fa3c76a66b6d76b1e109ea505c55e66360f3c nsfs: Add an ioctl() to return owner UID of a userns, 2017-01-25 [5]: opencontainers#767 (comment) [6]: opencontainers#767 (comment) Signed-off-by: W. Trevor King <wking@tremily.us>
1 similar comment
For more background on hierarchical namespaces, see [1]. For more background on the owning user namespace idea, see [2,3,4]). These were contentious [5,6], so they weren't part of the previous commit. I still think we want to say something about these relationships. We already have some of "runtime namespace" conditions (e.g. when a type is not listed in linux.namespaces[]), so runtimes should already have implementation-specific wording around what the runtime namespaces are (we don't explicitly make them implementation-defined, although we probably should). Anyhow, that's not a new concept added by this commit. # Seeded namespaces For example, if you ask for a new uts namespace but do not set the optional hostname, having the seed defined means that the hostname in the container UTS namespace is well-defined (it will be whatever the hostname was in the runtime UTS namespace). This is less of an issue for the mount namespace, because with root.path REQUIRED, there's no way to avoid clobbering whatever mounts you got from your seed (which makes not asking for a new mount namespace exciting ;). # Hierarchical namespaces I think "I want this container to run in a new user/pid namespace that is a child of the runtime user/pid namespace" should be something that has a portable config expression. Otherwise it becomes very unclear what to put in the hostID field for (u|g)idMappings, because you don't know what namespace will be used to interpret the hostIDs. # Namespace ownership This is another case where I think specified clarity is essential. A new network namespace will not be very useful if you don't know who owns it. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788 nsfs: add ioctl to get a parent namespace, 2016-09-06 [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6786741dbf99e44fb0c0ed85a37582b8a26f1c3b nsfs: add ioctl to get owning user namespace for ns file descriptor, 2016-09-06 [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5ff5ce6e20ee22511398bb31fb912466cf82a36 nsfs: Add an ioctl() to return the namespace type, 2017-01-25 [4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d95fa3c76a66b6d76b1e109ea505c55e66360f3c nsfs: Add an ioctl() to return owner UID of a userns, 2017-01-25 [5]: opencontainers#767 (comment) [6]: opencontainers#767 (comment) Signed-off-by: W. Trevor King <wking@tremily.us>
For more background on hierarchical namespaces, see [1]. For more background on the owning user namespace idea, see [2,3,4]). These were contentious [5,6], so they weren't part of the previous commit. I still think we want to say something about these relationships. We already have some of "runtime namespace" conditions (e.g. when a type is not listed in linux.namespaces[]), so runtimes should already have implementation-specific wording around what the runtime namespaces are (we don't explicitly make them implementation-defined, although we probably should). Anyhow, that's not a new concept added by this commit. # Seeded namespaces For example, if you ask for a new uts namespace but do not set the optional hostname, having the seed defined means that the hostname in the container UTS namespace is well-defined (it will be whatever the hostname was in the runtime UTS namespace). This is less of an issue for the mount namespace, because with root.path REQUIRED, there's no way to avoid clobbering whatever mounts you got from your seed (which makes not asking for a new mount namespace exciting ;). # Hierarchical namespaces I think "I want this container to run in a new user/pid namespace that is a child of the runtime user/pid namespace" should be something that has a portable config expression. Otherwise it becomes very unclear what to put in the hostID field for (u|g)idMappings, because you don't know what namespace will be used to interpret the hostIDs. # Namespace ownership This is another case where I think specified clarity is essential. A new network namespace will not be very useful if you don't know who owns it. # Glossary changes In review, Mrunal and others pointed out that the new config-linux entries overlapped with the existing glossary entry [7]. I've addressed that in this commit by trimming the glossary entries down to fix them (namespaces are not all hierarchical and processes aren't always in leaves). I've also dropped the handwavy and incorrect "children" sentence in favor of a link back to config-linux.md and the more specific RFC 2119 language from this commit. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788 nsfs: add ioctl to get a parent namespace, 2016-09-06 [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6786741dbf99e44fb0c0ed85a37582b8a26f1c3b nsfs: add ioctl to get owning user namespace for ns file descriptor, 2016-09-06 [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5ff5ce6e20ee22511398bb31fb912466cf82a36 nsfs: Add an ioctl() to return the namespace type, 2017-01-25 [4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d95fa3c76a66b6d76b1e109ea505c55e66360f3c nsfs: Add an ioctl() to return owner UID of a userns, 2017-01-25 [5]: opencontainers#767 (comment) [6]: opencontainers#767 (comment) [7]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2017/opencontainers.2017-05-10-21.03.log.html#l-115 But I was talking so my log entries are sparse :/. Signed-off-by: W. Trevor King <wking@tremily.us>
Previously we had no MUST-level runtime requirements for namespace entries in valid configs. This commit attempts to pin those down. For more background on hierarchical namespaces, see here. For more background on the owning user namespace idea, see here, here, and here.
The “
path
not associated with a namespace of typetype
” condition ensures that runtimes don't blindly callsetns(2)
on the path without settingnstype
nonzero.Part of #746.
Ping @cyphar, especially on the owning-user-namespace condition.